AITopics | Curaçao

World Cup 2026: Small nations Big Dreams

Al Jazeera, Dec-5-2025, 19:56:24 GMT

Curaçao, Cape Verde and Haiti have more going on behind the scenes than your average national team and still made it to the 2026 World Cup. Samantha Johnson looks at their journey and what lies ahead for them in football's biggest showpiece tournament.

artificial intelligence, game theory, world cup 2026, (4 more...)

Al Jazeera

Country:

North America > Haiti (0.27)
North America > Curaçao (0.27)
Asia > Middle East > Israel (0.27)
(10 more...)

Industry: Leisure & Entertainment > Games > Computer Games (0.40)

Technology:

Information Technology > Game Theory (0.43)
Information Technology > Artificial Intelligence > Games (0.40)

Revisiting Noise in Natural Language Processing for Computational Social Science

Borenstein, Nadav

arXiv.org Artificial Intelligence, Mar-10-2025

Computational Social Science (CSS) is an emerging field driven by the unprecedented availability of human-generated content for researchers. This field, however, presents a unique set of challenges due to the nature of the theories and datasets it explores, including highly subjective tasks and complex, unstructured textual corpora. Among these challenges, one of the less well-studied topics is the pervasive presence of noise. This thesis aims to address this gap in the literature by presenting a series of interconnected case studies that examine different manifestations of noise in CSS. These include character-level errors following the OCR processing of historical records, archaic language, inconsistencies in annotations for subjective and ambiguous tasks, and even noise and biases introduced by large language models during content generation. This thesis challenges the conventional notion that noise in CSS is inherently harmful or useless. Rather, it argues that certain forms of noise can encode meaningful information that is invaluable for advancing CSS research, such as the unique communication styles of individuals or the culture-dependent nature of datasets and tasks. Further, this thesis highlights the importance of nuance in dealing with noise and the considerations CSS researchers must address when encountering it, demonstrating that different types of noise require distinct strategies.

camembert-ft-sq-fr camembert-ft-sq-fr 54 54 52, convenient qualitative analysis and visualisation, hedonism pleasure and sensuous gratification, (16 more...)

arXiv.org Artificial Intelligence

2503.07395

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Poland (0.14)
Europe > Finland (0.14)
(130 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Questionnaire & Opinion Survey (1.00)
(2 more...)

Industry:

Media > News (1.00)
Leisure & Entertainment (1.00)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
(10 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
(4 more...)

Lawful and Accountable Personal Data Processing with GDPR-based Access and Usage Control in Distributed Systems

van Binsbergen, L. Thomas, Steketee, Marten C., Kebede, Milen G., Janssen, Heleen L., van Engers, Tom M.

arXiv.org Artificial Intelligence, Mar-10-2025

Compliance with the GDPR privacy regulation places a significant burden on organisations regarding the handling of personal data. The perceived efforts and risks of complying with the GDPR further increase when data processing activities span across organisational boundaries, as is the case in both small-scale data sharing settings and in large-scale international data spaces. This paper addresses these concerns by proposing a case-generic method for automated normative reasoning that establishes legal arguments for the lawfulness of data processing activities. The arguments are established on the basis of case-specific legal qualifications made by privacy experts, bringing the human in the loop. The obtained expert system promotes transparency and accountability, remains adaptable to extended or altered interpretations of the GDPR, and integrates into novel or existing distributed data processing systems. This result is achieved by defining a formal ontology and semantics for automated normative reasoning based on an analysis of the purpose-limitation principle of the GDPR. The ontology and semantics are implemented in eFLINT, a domain-specific language for specifying and reasoning with norms. The XACML architecture standard, applicable to both access and usage control, is extended, demonstrating how GDPR-based normative reasoning can integrate into (existing, distributed) systems for data processing. The resulting system is designed and critically assessed in reference to requirements extracted from the GDPR.

controller, processing action, relation, (15 more...)

arXiv.org Artificial Intelligence

2503.07172

Country:

Europe > Austria > Vienna (0.14)
Europe > Netherlands > North Holland > Amsterdam (0.04)
Europe > Switzerland (0.04)
(5 more...)

Genre: Research Report (0.40)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.94)

GIMMICK -- Globally Inclusive Multimodal Multitask Cultural Knowledge Benchmarking

Schneider, Florian, Holtermann, Carolin, Biemann, Chris, Lauscher, Anne

arXiv.org Artificial Intelligence, Feb-19-2025

Large Vision-Language Models (LVLMs) have recently gained attention due to their distinctive performance and broad applicability. While it has been previously shown that their efficacy in usage scenarios involving non-Western contexts falls short, existing studies are limited in scope, covering just a narrow range of cultures, focusing exclusively on a small number of cultural aspects, or evaluating a limited selection of models on a single task only. Towards globally inclusive LVLM research, we introduce GIMMICK, an extensive multimodal benchmark designed to assess a broad spectrum of cultural knowledge across 144 countries representing six global macro-regions. GIMMICK comprises six tasks built upon three new datasets that span 728 unique cultural events or facets on which we evaluated 20 LVLMs and 11 LLMs, including five proprietary and 26 open-weight models of all sizes. We systematically examine (1) regional cultural biases, (2) the influence of model size, (3) input modalities, and (4) external cues. Our analyses reveal strong biases toward Western cultures across models and tasks and highlight strong correlations between model size and performance, as well as the effectiveness of multimodal input and external geographic cues. We further find that models have more knowledge of tangible than intangible aspects (e.g., food vs. rituals) and that they excel in recognizing broad cultural origins but struggle with a more nuanced understanding.

copyright, internvl2, knowledge, (13 more...)

arXiv.org Artificial Intelligence

2502.13766

Country:

South America > Colombia (0.28)
Africa > Republic of the Congo (0.28)
Europe > Germany (0.14)
(169 more...)

Genre: Research Report (1.00)

Industry:

Law (1.00)
Health & Medicine (1.00)
Media (0.92)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

iLOCO: Distribution-Free Inference for Feature Interactions

Little, Camille, Zheng, Lili, Allen, Genevera

arXiv.org Machine Learning, Feb-10-2025

Feature importance measures are widely studied and are essential for understanding model behavior, guiding feature selection, and enhancing interpretability. However, many fitted machine learning models involve complex, higher-order interactions between features. Existing feature importance metrics fail to capture these higher-order effects, while existing interaction metrics often suffer from limited applicability or excessive computation; no methods exist to conduct statistical inference for feature interactions. To bridge this gap, we first propose a new model-agnostic metric, interaction Leave-One-Covariate-Out (iLOCO), for measuring the importance of higher-order feature interactions. Next, we leverage recent advances in LOCO inference to develop distribution-free and assumption-light confidence intervals for our iLOCO metric. To address computational challenges, we also introduce an ensemble learning method for calculating the iLOCO metric and confidence intervals that we show is both computationally and statistically efficient. We validate our iLOCO metric and our confidence intervals on both synthetic and real data sets, showing that our approach outperforms existing methods and provides the first inferential approach to detecting feature interactions.

artificial intelligence, interaction, machine learning, (16 more...)

arXiv.org Machine Learning

2502.06661

Country:

North America > Curaçao (0.04)
North America > United States > Illinois > Champaign County > Urbana (0.04)

Genre: Research Report (0.64)

Industry:

Consumer Products & Services > Food, Beverage, Tobacco & Cannabis (0.46)
Health & Medicine > Pharmaceuticals & Biotechnology (0.46)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

A Video-grounded Dialogue Dataset and Metric for Event-driven Activities

Imrattanatrai, Wiradee, Asada, Masaki, Hasegawa, Kimihiro, Cheng, Zhi-Qi, Fukuda, Ken, Mitamura, Teruko

arXiv.org Artificial Intelligence, Jan-30-2025

This paper presents VDAct, a dataset for a Video-grounded Dialogue on Event-driven Activities, alongside VDEval, a session-based context evaluation metric specially designed for the task. Unlike existing datasets, VDAct includes longer and more complex video sequences that depict a variety of event-driven activities that require advanced contextual understanding for accurate response generation. The dataset comprises 3,000 dialogues with over 30,000 question-and-answer pairs, derived from 1,000 videos with diverse activity scenarios. VDAct displays a notably challenging characteristic due to its broad spectrum of activity scenarios and wide range of question types. Empirical studies on state-of-the-art vision foundation models highlight their limitations in addressing certain question types on our dataset. Furthermore, VDEval, which integrates dialogue session history and video content summaries extracted from our supplementary Knowledge Graphs to evaluate individual responses, demonstrates a significantly higher correlation with human assessments on the VDAct dataset than existing evaluation metrics that rely solely on the context of single dialogue turns.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2501.18324

Country:

Europe > Netherlands > North Holland > Amsterdam (0.04)
North America > United States > Texas > Travis County > Austin (0.04)
North America > Curaçao (0.04)
(5 more...)

Genre: Research Report > New Finding (0.46)

Industry: Leisure & Entertainment > Sports > Soccer (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Communications (0.93)
(2 more...)

Application of AI-based Models for Online Fraud Detection and Analysis

Papasavva, Antonis, Johnson, Shane, Lowther, Ed, Lundrigan, Samantha, Mariconti, Enrico, Markovska, Anna, Tuptuk, Nilufer

arXiv.org Artificial Intelligence, Sep-25-2024

Fraud is a prevalent offence that extends beyond financial loss, causing psychological and physical harm to victims. The advancements in online communication technologies allowed for online fraud to thrive in this vast network, with fraudsters increasingly using these channels for deception. With the progression of technologies like AI, there is a growing concern that fraud will scale up, using sophisticated methods, like deep-fakes in phishing campaigns, all generated by language generation models like ChatGPT. However, the application of AI in detecting and analyzing online fraud remains understudied. We conduct a Systematic Literature Review on AI and NLP techniques for online fraud detection. The review adhered to the PRISMA-ScR protocol, with eligibility criteria including relevance to online fraud, use of text data, and AI methodologies. We screened 2,457 academic records, of which 350 met our eligibility criteria, and included 223. We report the state-of-the-art NLP techniques for analysing various online fraud categories; the training data sources; the NLP algorithms and models built; and the performance metrics employed for model evaluation. We find that current research on online fraud is divided into various scam activities and identify 16 different frauds that researchers focus on. This SLR enhances the academic understanding of AI-based detection methods for online fraud and offers insights for policymakers, law enforcement, and businesses on safeguarding against such activities. We conclude that focusing on specific scams lacks generalization, as multiple models are required for different fraud types. The evolving nature of scams limits the effectiveness of models trained on outdated data. We also identify issues in data limitations, training bias reporting, and selective presentation of metrics in model performance reporting, which can lead to potential biases in model evaluation.

information retrieval, large language model, machine learning, (23 more...)

arXiv.org Artificial Intelligence

2409.19022

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Europe > United Kingdom > England > Greater London > London (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
(11 more...)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)

Industry:

Law Enforcement & Public Safety > Fraud (1.00)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Information Technology > Security & Privacy (1.00)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)
(5 more...)

SeaSplat: Representing Underwater Scenes with 3D Gaussian Splatting and a Physically Grounded Image Formation Model

Yang, Daniel, Leonard, John J., Girdhar, Yogesh

arXiv.org Artificial Intelligence, Sep-25-2024

We introduce SeaSplat, a method to enable real-time rendering of underwater scenes leveraging recent advances in 3D radiance fields. Underwater scenes are challenging visual environments, as rendering through a medium such as water introduces both range- and color-dependent effects on image capture. We constrain 3D Gaussian Splatting (3DGS), a recent advance in radiance fields enabling rapid training and real-time rendering of full 3D scenes, with a physically grounded underwater image formation model. Applying SeaSplat to the real-world scenes from the SeaThru-NeRF dataset, a scene collected by an underwater vehicle in the US Virgin Islands, and simulation-degraded real-world scenes, not only do we see increased quantitative performance on rendering novel viewpoints from the scene with the medium present, but we are also able to recover the underlying true color of the scene and restore renders to be without the presence of the intervening medium. We show that the underwater image formation helps learn scene structure, with better depth maps, and that our improvements maintain the significant computational improvements afforded by leveraging a 3D Gaussian representation.

gaussian representation, seasplat, view synthesis, (12 more...)

arXiv.org Artificial Intelligence

2409.17345

Country:

North America > US Virgin Islands (0.24)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
North America > Panama (0.04)
(10 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)

CaLMQA: Exploring culturally specific long-form question answering across 23 languages

Arora, Shane, Karpinska, Marzena, Chen, Hung-Ting, Bhattacharjee, Ipsita, Iyyer, Mohit, Choi, Eunsol

arXiv.org Artificial Intelligence, Jul-3-2024

Large language models (LLMs) are used for long-form question answering (LFQA), which requires them to generate paragraph-length answers to complex questions. While LFQA has been well-studied in English, this research has not been extended to other languages. To bridge this gap, we introduce CaLMQA, a collection of 1.5K complex culturally specific questions spanning 23 languages and 51 culturally agnostic questions translated from English into 22 other languages. We define culturally specific questions as those uniquely or more likely to be asked by people from cultures associated with the question's language. We collect naturally-occurring questions from community web forums and hire native speakers to write questions to cover under-resourced, rarely-studied languages such as Fijian and Kirundi. Our dataset contains diverse, complex questions that reflect cultural topics (e.g. traditions, laws, news) and the language usage of native speakers. We automatically evaluate a suite of open- and closed-source models on CaLMQA by detecting incorrect language and token repetitions in answers, and observe that the quality of LLM-generated answers degrades significantly for some low-resource languages. Lastly, we perform human evaluation on a subset of models and languages. Manual evaluation reveals that model performance is significantly worse for culturally specific questions than for culturally agnostic questions. Our findings highlight the need for further research in non-English LFQA and provide an evaluation framework.

annotator, culturally agnostic question, culturally specific question, (13 more...)

arXiv.org Artificial Intelligence

2406.17761

Country:

Africa > Niger (0.04)
North America > United States > Massachusetts > Hampshire County > Amherst (0.04)
Europe > Russia (0.04)
(43 more...)

Genre: Research Report > New Finding (0.34)

Industry:

Education (1.00)
Health & Medicine > Therapeutic Area (0.97)
Banking & Finance (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.73)

MIRAI: Evaluating LLM Agents for Event Forecasting

Ye, Chenchen, Hu, Ziniu, Deng, Yihe, Huang, Zijie, Ma, Mingyu Derek, Zhu, Yanqiao, Wang, Wei

arXiv.org Artificial Intelligence, Jul-1-2024

Recent advancements in Large Language Models (LLMs) have empowered LLM agents to autonomously collect world information, over which to conduct reasoning to solve complex problems. Given this capability, there is increasing interest in employing LLM agents for predicting international events, which can influence decision-making and shape policy development on an international scale. Despite this growing interest, there is a lack of a rigorous benchmark of LLM agents' forecasting capability and reliability. To address this gap, we introduce MIRAI, a novel benchmark designed to systematically evaluate LLM agents as temporal forecasters in the context of international events. Our benchmark features an agentic environment with tools for accessing an extensive database of historical, structured events and textual news articles. We refine the GDELT event database with careful cleaning and parsing to curate a series of relational prediction tasks with varying forecasting horizons, assessing LLM agents' abilities from short-term to long-term forecasting. We further implement APIs to enable LLM agents to utilize different tools via a code-based interface. In summary, MIRAI comprehensively evaluates the agents' capabilities in three dimensions: 1) autonomously sourcing and integrating critical information from large global databases; 2) writing code using domain-specific APIs and libraries for tool use; and 3) jointly reasoning over historical knowledge from diverse formats and time to accurately predict future events. Through comprehensive benchmarking, we aim to establish a reliable framework for assessing the capabilities of LLM agents in forecasting international events, thereby contributing to the development of more accurate and trustworthy models for international relations analysis.

cameocode, isocode, relation, (15 more...)

arXiv.org Artificial Intelligence

2407.01231

Country:

Asia > North Korea (0.14)
Oceania > Australia > Australian Indian Ocean Territories > Territory of Cocos (Keeling) Islands (0.14)
North America > United States > California > Los Angeles County > Los Angeles (0.14)
(234 more...)

Genre: Research Report > New Finding (0.45)

Industry:

Law (1.00)
Government > Foreign Policy (1.00)
Government > Military (0.93)
Information Technology (0.92)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
